Abstract. The duality theory of the Monge-Kantorovich transport problem is analyzed in a general setting. The spaces $X, Y$ are assumed to be Polish and equipped with Borel probability measures $\mu$ and $\nu$. The transport cost function $c : X \times Y \to [0, \infty]$ is assumed to be Borel. Our main result states that in this setting there is no duality gap, provided the optimal transport problem is formulated in a suitably relaxed way. The relaxed transport problem is defined as the limiting cost of the partial transport of masses $1 - \varepsilon$ from $(X, \mu)$ to $(Y, \nu)$, as $\varepsilon > 0$ tends to zero.

The classical duality theorems of H. Kellerer, where c is lower semi-continuous 
or uniformly bounded, quickly follow from these general results.

1. Introduction

We consider the Monge-Kantorovich transport problem for Borel probability measures $\mu, \nu$ on Polish spaces $X, Y$. See [Vil03, Vil09] for an excellent account of the theory of optimal transportation.

The set $\Pi(\mu, \nu)$ consists of all Monge-Kantorovich transport plans, that is, Borel probability measures on $X \times Y$ which have $X$-marginal $\mu$ and $Y$-marginal $\nu$. The transport costs associated to a transport plan $\pi$ are given by

$$\langle c, \pi \rangle = \int_{X \times Y} c(x, y)\, d\pi(x, y). \tag{1}$$

In most applications of the theory of optimal transport, the cost function $c : X \times Y \to [0, \infty]$ is lower semi-continuous and only takes values in $\mathbb{R}_+$. But equation (1) makes perfect sense if the $[0, \infty]$-valued cost function is merely Borel measurable. We therefore assume throughout this paper that $c : X \times Y \to [0, \infty]$ is a Borel measurable function which may very well assume the value $+\infty$ for "many" $(x, y) \in X \times Y$.

An application where the value $\infty$ occurs in a natural way is transport between measures on Wiener space $X = (C[0, 1], \|\cdot\|_\infty)$, where $c(x, y)$ is the squared norm of $x - y$ in the Cameron-Martin space, defined to be $\infty$ if $x - y$ does not belong to this space. Hence in this situation the set $\{y : c(x, y) < \infty\}$ has $\nu$-measure 0, for every $x \in X$, if the measure $\nu$ is absolutely continuous with respect to the Wiener measure on $C[0, 1]$. (See [FU02], [FU04a].)



Turning back to the general problem: the (primal) Monge-Kantorovich problem is to determine the primal value

$$P := P_c := \inf\{\langle c, \pi \rangle : \pi \in \Pi(\mu, \nu)\} \tag{2}$$

and to identify a primal optimizer $\pi \in \Pi(\mu, \nu)$. To formulate the dual problem, we define

$$\Psi(\mu, \nu) = \left\{ (\varphi, \psi) : \begin{array}{l} \varphi : X \to [-\infty, \infty),\ \psi : Y \to [-\infty, \infty) \text{ integrable}, \\ \varphi(x) + \psi(y) \le c(x, y) \text{ for all } (x, y) \in X \times Y \end{array} \right\}. \tag{3}$$

The dual Monge-Kantorovich problem then consists in determining

$$D := D_c := \sup\left\{ \int_X \varphi\, d\mu + \int_Y \psi\, d\nu \right\} \tag{4}$$

for $(\varphi, \psi) \in \Psi(\mu, \nu)$. We say that Monge-Kantorovich duality holds true, or that there is no duality gap, if the primal value $P$ of the problem equals the dual value $D$, i.e. if we have

$$\inf\{\langle c, \pi \rangle : \pi \in \Pi(\mu, \nu)\} = \sup\left\{ \int_X \varphi\, d\mu + \int_Y \psi\, d\nu : (\varphi, \psi) \in \Psi(\mu, \nu) \right\}. \tag{5}$$

There is a long line of research on these questions, initiated already by Kantorovich ([Kan42]) himself and continued by numerous others (we mention [KR58, Dud76, Dud02, dA82, GR81, Fer81, Szu82, Mik06, MT06], see also the bibliographical notes in [Vil09, p. 86, 87]).

The validity of the above duality (5) was established in pleasant generality by H. Kellerer [Kel84]. He proved that there is no duality gap provided that $c$ is lower semi-continuous (see [Kel84, Theorem 2.2]) or just Borel measurable and bounded by a constant¹ ([Kel84, Theorem 2.14]). In [RR95, RR96] the problem is investigated beyond the realm of Polish spaces and a characterization is given for which spaces duality holds for all bounded measurable cost functions. We also refer to the seminal paper [GM96] by W. Gangbo and R. McCann.

We now present a rather trivial example² which shows that, in general, there is a duality gap.

Example 1.1. Consider $X = Y = [0, 1]$ and $\mu = \nu$ the Lebesgue measure. Define $c$ on $X \times Y$ to be 0 below the diagonal, 1 on the diagonal and $\infty$ else, i.e.

$$c(x, y) = \begin{cases} 0, & \text{for } 0 \le y < x \le 1, \\ 1, & \text{for } 0 \le x = y \le 1, \\ \infty, & \text{for } 0 \le x < y \le 1. \end{cases}$$

Then the only finite transport plan is concentrated on the diagonal and leads to costs of one, so that $P = 1$. On the other hand, for admissible $(\varphi, \psi) \in \Psi(\mu, \nu)$, it is straightforward to check that $\varphi(x) + \psi(x) > 0$ can hold true for at most countably many $x \in [0, 1]$. Hence the dual value equals $D = 0$, so that there is a duality gap.

A common technique in the duality theory of convex optimisation is to pass to a relaxed version of the problem, i.e., to enlarge the sets over which the primal and/or dual functionals are optimized. We do so, for the primal problem (2), by requiring only the transport of a portion of mass $1 - \varepsilon$ from $\mu$ to $\nu$, for every $\varepsilon > 0$. Fix $0 \le \varepsilon \le 1$ and define

$$\Pi^\varepsilon(\mu, \nu) = \{\pi \in \mathcal{M}_+(X \times Y) : \|\pi\| \ge 1 - \varepsilon,\ p_X(\pi) \le \mu,\ p_Y(\pi) \le \nu\}.$$

Here $\mathcal{M}_+(X \times Y)$ denotes the non-negative Borel measures $\pi$ on $X \times Y$ with norm $\|\pi\| = \pi(X \times Y)$; by $p_X(\pi) \le \mu$ (resp. $p_Y(\pi) \le \nu$) we mean that the projection of $\pi$ onto $X$ (resp. onto $Y$) is dominated by $\mu$ (resp. $\nu$). We denote by $P^\varepsilon$ the value of the $1 - \varepsilon$ partial transportation problem

$$P^\varepsilon := \inf\left\{ \langle c, \pi \rangle = \int_{X \times Y} c(x, y)\, d\pi(x, y) : \pi \in \Pi^\varepsilon(\mu, \nu) \right\}. \tag{6}$$

¹ Or, more generally, by the sum $f(x) + g(y)$ of two integrable functions $f, g$.
² This is essentially [Kel84, Example 2.5].

This partial transport problem has recently been studied by L. Caffarelli and R. McCann [CM06] as well as A. Figalli [Fig09]. In their work the emphasis is on a finer analysis of the Monge problem for the squared Euclidean distance on $\mathbb{R}^n$, and pertains to a fixed $\varepsilon > 0$. In the present paper, we do not deal with these more subtle issues of the Monge problem and always remain in the realm of the Kantorovich problem (2). Our emphasis is on the limiting behavior for $\varepsilon \to 0$: we call

$$P^{\mathrm{rel}} := P^{\mathrm{rel}}_c := \lim_{\varepsilon \to 0} P^\varepsilon \tag{7}$$

the relaxed primal value of the transport plan. Obviously this limit exists (assuming possibly the value $+\infty$) and $P^{\mathrm{rel}} \le P$.

As a motivation for the subsequent theorem the reader may observe that, in Example 1.1 above, we have $P^{\mathrm{rel}} = 0$ (while $P = 1$). Indeed, it is possible to transport the measure $\mu 1_{[\varepsilon, 1]}$ to the measure $\nu 1_{[0, 1 - \varepsilon]}$ with transport cost zero by the partial transport plan $\pi = (\mathrm{id}, \mathrm{id} - \varepsilon)_\# (\mu 1_{[\varepsilon, 1]})$.
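A finite analogue may help to make the difference between $P$ and $P^{\mathrm{rel}}$ concrete. The following Python sketch is an illustration only and is not taken from the paper: it assumes that $[0, 1]$ is replaced by $n$ equally weighted grid points, and the helper names `cost`, `plan_cost`, `marginals`, `diagonal_plan` and `shift_plan` are hypothetical. In this discrete model the diagonal plan is the only full plan of finite cost, while shifting every point by one grid step gives a partial plan of mass $1 - 1/n$ and cost zero.

```python
from fractions import Fraction
import math


def cost(i, j):
    """Discrete analogue of c in Example 1.1 for grid indices i (source), j (target)."""
    if j < i:
        return 0          # below the diagonal
    if j == i:
        return 1          # on the diagonal
    return math.inf       # above the diagonal


def plan_cost(plan):
    """Total transport cost of a plan stored as {(i, j): mass}."""
    return sum(cost(i, j) * mass for (i, j), mass in plan.items())


def marginals(plan, n):
    """Row (source) and column (target) marginals of a plan on an n x n grid."""
    row = [Fraction(0)] * n
    col = [Fraction(0)] * n
    for (i, j), mass in plan.items():
        row[i] += mass
        col[j] += mass
    return row, col


def diagonal_plan(n):
    """Full plan in which every point stays where it is."""
    return {(i, i): Fraction(1, n) for i in range(n)}


def shift_plan(n):
    """Partial plan: point i is sent to i - 1; point 0 is left untransported."""
    return {(i, i - 1): Fraction(1, n) for i in range(1, n)}


n = 10
partial = shift_plan(n)
row, col = marginals(partial, n)
print("cost of the diagonal plan:", plan_cost(diagonal_plan(n)))
print("cost of the shifted plan:", plan_cost(partial))
print("mass of the shifted plan:", sum(partial.values()))
```

Letting $n$ grow corresponds to letting $\varepsilon = 1/n$ tend to zero: the untransported mass vanishes while the cost of the shifted plan stays at zero.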

We now can formulate our main result. 

Theorem 1.2. Let $X, Y$ be Polish spaces, equipped with Borel probability measures $\mu, \nu$, and let $c : X \times Y \to [0, \infty]$ be Borel measurable.

Then there is no duality gap, if the primal problem is defined in the relaxed form (7) while the dual problem is formulated in its usual form (4). In other words, we have

$$P^{\mathrm{rel}} = D. \tag{8}$$

We observe that in (8) also the value $+\infty$ is possible.

The theorem gives a positive result on the issue of duality in the Monge-Kantorovich problem. Moreover we have $P = P^{\mathrm{rel}}$ and therefore $P = D$ in any of the following cases.

(a) $c$ is lower semi-continuous,

(b) $c$ is uniformly bounded or, more generally,

(c) $c$ is $\mu \otimes \nu$-a.s. finitely valued.

Concerning (a) and (b), it is rather straightforward to check that these assumptions imply $P = P^{\mathrm{rel}}$ (see Corollaries 3.1 and 3.3 below). In particular, the classical duality results of Kellerer quickly follow from Theorem 1.2. Showing that property (c) is also sufficient seems to be more sophisticated; it follows from [BS09, Theorem 1].

A sufficient condition for attainment in the primal part of the Monge-Kantorovich transport problem is that the cost function $c$ is lower semi-continuous and we have nothing to add here.

To analyze the same question concerning the dual problem we need some preparation: consider the following alternative definition of $P^{\mathrm{rel}}$. One may relax the transport costs by cutting the maximal transport costs, i.e. we could alter the cost function $c$ to $c \wedge M$ for some $M \ge 0$ or to $c \wedge h$ for some $\mu \otimes \nu$-a.s. finite, measurable function $h : X \times Y \to [0, \infty]$. If $M$ resp. $h$ is large this should have a similar effect as ignoring a small mass. Indeed we will establish that

$$\lim_{n \to \infty} P_{c \wedge h_n} = P^{\mathrm{rel}}_c \tag{9}$$

for any sequence of measurable functions $h_n : X \times Y \to [0, \infty)$ increasing (uniformly) to $\infty$.

In Theorem 3.5 below we then prove that we have dual attainment (in the sense of [BS09, Section 1.1]) if and only if there exists some finite measurable function $h : X \times Y \to [0, \infty)$ so that

$$P_{c \wedge h} = P^{\mathrm{rel}}_c. \tag{10}$$

The paper is organized as follows. 

In Section 2 we show Theorem 1.2. The proof is self-contained with the exception of Lemma A.1 which is a consequence of [Kel84, Lemma 1.8]. For the convenience of the reader we provide a derivation of Lemma A.1 in the Appendix.

Section 3 deals with consequences of Theorem 1.2. First we re-derive the classical duality results of Kellerer. Then we establish (with the help of [BS09, Theorem 2]) the alternative characterization of $P^{\mathrm{rel}}$ given in (9) and the characterization of dual attainment via (10).

2. The Proof of the Duality Theorem 

The proof of Theorem 1.2 relies on Fenchel's perturbation technique. We refer to the accompanying paper [BLS09] for a didactic presentation of this technique: there we give an elementary version of this argument, where $X = Y = \{1, \ldots, N\}$ equipped with the uniform measure $\mu = \nu$, in which case the optimal transport problem reduces to a finite linear programming problem.
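To get a feeling for the finite setting $X = Y = \{1, \ldots, N\}$, the following Python sketch may be useful. It is not the argument of [BLS09] and does not implement Fenchel's perturbation technique; it only illustrates the primal value and the inequality $D \le P$ on a small example. It assumes integer costs and relies on the standard Birkhoff-von Neumann theorem, by which the finite primal problem with uniform marginals attains its minimum at a permutation; the function names `primal_value`, `feasible_dual_pair`, `is_feasible` and `dual_value` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def primal_value(c):
    """Primal value for uniform marginals on n points, by brute force over permutations."""
    n = len(c)
    best = min(sum(c[i][s[i]] for i in range(n)) for s in permutations(range(n)))
    return Fraction(best, n)


def feasible_dual_pair(c):
    """One admissible pair (phi, psi) with phi[i] + psi[j] <= c[i][j] (not necessarily optimal)."""
    n = len(c)
    phi = [min(c[i]) for i in range(n)]
    psi = [min(c[i][j] - phi[i] for i in range(n)) for j in range(n)]
    return phi, psi


def is_feasible(c, phi, psi):
    """Check the dual constraint at every pair of points."""
    n = len(c)
    return all(phi[i] + psi[j] <= c[i][j] for i in range(n) for j in range(n))


def dual_value(phi, psi):
    """Value of the dual functional for the uniform measure."""
    return Fraction(sum(phi) + sum(psi), len(phi))


# A small integer cost matrix chosen for illustration only.
c = [[4, 1, 3],
     [2, 0, 5],
     [3, 2, 2]]
phi, psi = feasible_dual_pair(c)
print("primal value:", primal_value(c))
print("dual value of the chosen pair:", dual_value(phi, psi))
```

The brute-force search is only practical for very small $N$; for larger problems one would use a linear programming or assignment solver instead.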

We start with an easy result showing that the relaxed version (7) of the optimal transport problem is not "too relaxed", in the sense that the trivial implication of the minmax theorem still holds true.

Proposition 2.1. Under the assumptions of Theorem 1.2 we have

$$P^{\mathrm{rel}} \ge D.$$

Proof. Let $(\varphi, \psi)$ be integrable Borel functions such that

$$\varphi(x) + \psi(y) \le c(x, y), \quad \text{for every } (x, y) \in X \times Y. \tag{11}$$

Let $\pi_n \in \Pi(f_n \mu, g_n \nu)$ be an optimizing sequence for the relaxed problem, where $f_n \le 1$, $g_n \le 1$, and $\pi_n(X \times Y) = \|f_n\|_{L^1(\mu)} = \|g_n\|_{L^1(\nu)}$ tends to one. By passing to a subsequence we may assume that $(f_n)_{n=1}^\infty$ and $(g_n)_{n=1}^\infty$ converge a.s. to 1. We may estimate

$$\liminf_{n \to \infty} \int_{X \times Y} c\, d\pi_n \ge \liminf_{n \to \infty} \left[ \int_X \varphi f_n\, d\mu + \int_Y \psi g_n\, d\nu \right] = \int_X \varphi\, d\mu + \int_Y \psi\, d\nu,$$

where in the last equality we have used Lebesgue's theorem on dominated convergence. □

The next lemma is a technical result which will be needed in the formalization of the proof of Theorem 1.2.

Lemma 2.2. Let $V$ be a normed vector space, $x_0 \in V$, and let $\Phi : V \to (-\infty, \infty]$ be a positively homogeneous³ convex function such that

$$\liminf_{\|x - x_0\| \to 0} \Phi(x) \ge \Phi(x_0).$$

If $\Phi(x_0) < \infty$ then, for each $\varepsilon > 0$, there exists a continuous linear functional $v : V \to \mathbb{R}$ such that

$$\Phi(x_0) - \varepsilon < v(x_0) \quad \text{and} \quad \Phi(x) \ge v(x), \text{ for all } x \in V.$$

If $\Phi(x_0) = \infty$ then, for each $M > 0$, there exists a continuous linear functional $v : V \to \mathbb{R}$ such that

$$M < v(x_0) \quad \text{and} \quad \Phi(x) \ge v(x), \text{ for all } x \in V.$$

Proof. Assume first that $\Phi(x_0) < \infty$. Let $K = \{(x, t) : x \in V,\ t \ge \Phi(x)\}$ be the epigraph of $\Phi$ and $\overline{K}$ its closure in $V \times \mathbb{R}$. Since $\Phi$ is assumed to be lower semi-continuous at $x_0$, we have $\inf\{t : (x_0, t) \in \overline{K}\} = \Phi(x_0)$, hence $(x_0, \Phi(x_0) - \varepsilon) \notin \overline{K}$. By Hahn-Banach, there is a continuous linear functional $w \in V^* \times \mathbb{R}$ given by $w(x, t) = u(x) + st$ (where $u \in V^*$ and $s \in \mathbb{R}$) and $\beta \in \mathbb{R}$ such that $w(x, t) > \beta$ for $(x, t) \in \overline{K}$ and $w(x_0, \Phi(x_0) - \varepsilon) < \beta$. By the positive homogeneity of $\Phi$, we have $\beta < 0$, hence $s > 0$. Also $u(x) + s\Phi(x) > \beta$ and by applying positive homogeneity once more we see that $\beta$ can be replaced by 0. Hence we have

$$u(x) + s\Phi(x) \ge 0 > u(x_0) + s(\Phi(x_0) - \varepsilon),$$

so just let $v(x) := -u(x)/s$. In the case $\Phi(x_0) = \infty$ the assertion is proved analogously. □

³ By positively homogeneous we mean $\Phi(\lambda x) = \lambda \Phi(x)$, for $\lambda \ge 0$, with the convention $0 \cdot \infty = 0$.

We now define the function $\Phi$ to which we shall apply the previous lemma.

Let $W = L^1(\mu) \times L^1(\nu)$ and $V$ the subspace of co-dimension one, formed by the pairs $(f, g)$ such that $\int_X f\, d\mu = \int_Y g\, d\nu$. By $V_+ = \{(f, g) \in V : f \ge 0,\ g \ge 0\}$ we denote the positive orthant of $V$. For $(f, g) \in V_+$, we define, by slight abuse of notation, $\Pi(f, g)$ as the set of non-negative Borel measures $\pi$ on $X \times Y$ with marginals $f\mu$ and $g\nu$ respectively. With this notation $\Pi(1, 1)$ is just the set $\Pi(\mu, \nu)$ introduced above. Define $\Phi : V_+ \to [0, \infty]$ by

$$\Phi(f, g) = \inf\left\{ \int_{X \times Y} c(x, y)\, d\pi(x, y) : \pi \in \Pi(f, g) \right\}, \quad (f, g) \in V_+,$$

which is a convex function. By definition we have $\Phi(1, 1) = P$, where $P$ is the primal value of (2). Our matter of concern will be the lower semi-continuity of the function $\Phi$ at the point $(1, 1) \in V_+$.

Proposition 2.3. Denote by $\overline{\Phi} : V \to [0, \infty]$ the lower semi-continuous envelope of $\Phi$, i.e., the largest lower semi-continuous function on $V$ dominated by $\Phi$ on $V_+$. Then

$$\overline{\Phi}(1, 1) = P^{\mathrm{rel}}. \tag{12}$$

Hence the function $\Phi$ is lower semi-continuous at $(1, 1)$ if and only if $P = P^{\mathrm{rel}}$.

Proof. Let $(\pi_n)_{n=1}^\infty$ be an optimizing sequence for the relaxed problem (7), i.e., a sequence of non-negative measures on $X \times Y$ such that

$$\lim_{n \to \infty} \int_{X \times Y} c(x, y)\, d\pi_n(x, y) = P^{\mathrm{rel}},$$

$$\lim_{n \to \infty} \|\pi_n\| = \lim_{n \to \infty} \int_{X \times Y} 1\, d\pi_n(x, y) = 1,$$

and such that $p_X(\pi_n) \le \mu$ and $p_Y(\pi_n) \le \nu$. In particular $p_X(\pi_n) = f_n \mu$ and $p_Y(\pi_n) = g_n \nu$ with $(f_n)_{n=1}^\infty$ (resp. $(g_n)_{n=1}^\infty$) converging to 1 in the norm of $L^1(\mu)$ (resp. $L^1(\nu)$). It follows that

$$\overline{\Phi}(1, 1) \le \lim_{n \to \infty} \Phi(f_n, g_n) = P^{\mathrm{rel}}.$$

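To prove the reverse inequality $\overline{\Phi}(1, 1) \ge P^{\mathrm{rel}}$, fix $\delta > 0$. We have to show that for each $\varepsilon > 0$ there is some $\tilde{\pi} \in \Pi^\varepsilon(\mu, \nu)$ such that

$$\overline{\Phi}(1, 1) + \delta \ge \int c\, d\tilde{\pi}. \tag{13}$$

Pick $\gamma \in (0, 1)$ such that $(1 - \gamma)^3 \ge 1 - \varepsilon$. Pick $f, g$ and $\pi \in \Pi(f, g)$ such that $\|f - 1\|_{L^1(\mu)}, \|g - 1\|_{L^1(\nu)} < \gamma$ and $\overline{\Phi}(1, 1) + \delta \ge \int c\, d\pi$. We note for later use that $\|\pi\| = \|f\|_{L^1(\mu)} \in (1 - \gamma, 1 + \gamma)$. Define the Borel measure $\tilde{\pi} \le \pi$ on $X \times Y$ by

$$\frac{d\tilde{\pi}}{d\pi}(x, y) := \frac{1}{(1 + |f(x) - 1|)(1 + |g(y) - 1|)},$$

and set $\tilde{\mu} := p_X(\tilde{\pi})$, $\tilde{\nu} := p_Y(\tilde{\pi})$. As $\frac{d\tilde{\pi}}{d\pi} \le 1$, we have $\tilde{\pi} \le \pi$ so that (13) is satisfied. Also $\tilde{\mu} \le \mu$ and $\tilde{\nu} \le \nu$. Thus it remains to check that $\|\tilde{\pi}\| \ge 1 - \varepsilon$.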

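The function $F(a, b) = \frac{1}{(1 + a)(1 + b)}$ is convex on $[0, \infty)^2$ and by Jensen's inequality we have

$$\|\tilde{\pi}\| = \|\pi\| \int F(|f(x) - 1|, |g(y) - 1|)\, \frac{d\pi}{\|\pi\|} \ge \tag{14}$$

$$\ge \|\pi\|\, F\!\left( \frac{\|f - 1\|_{L^1(\mu)}}{\|\pi\|}, \frac{\|g - 1\|_{L^1(\nu)}}{\|\pi\|} \right) \ge (1 - \gamma) \left( \frac{1}{1 + \frac{\gamma}{1 - \gamma}} \right)^2 \ge 1 - \varepsilon, \tag{15}$$

as required.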

The final assertion of the proposition is now obvious. □ 
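The rescaling step in the proof above can be checked on a finite example. The following Python sketch is an illustration under stated assumptions, not part of the paper: measures on a two-point space are stored as lists of exact fractions, and the function name `rescale` as well as the sample plan are hypothetical. It only verifies the two domination properties used in the proof, namely that the rescaled plan lies below the original plan and that its marginals lie below $\mu$ and $\nu$; the lower bound on the total mass is not tested here.

```python
from fractions import Fraction


def rescale(plan, mu, nu):
    """Multiply `plan` by the density 1 / ((1 + |f - 1|) * (1 + |g - 1|)),
    where f and g are the densities of the marginals of `plan` w.r.t. mu and nu."""
    n, m = len(mu), len(nu)
    f = [sum(plan[i]) / mu[i] for i in range(n)]
    g = [sum(plan[i][j] for i in range(n)) / nu[j] for j in range(m)]
    return [[plan[i][j] / ((1 + abs(f[i] - 1)) * (1 + abs(g[j] - 1)))
             for j in range(m)] for i in range(n)]


mu = [Fraction(1, 2), Fraction(1, 2)]
nu = [Fraction(1, 2), Fraction(1, 2)]
# A plan whose first-row marginal (6/10) exceeds mu[0] = 1/2, so it is not yet admissible.
plan = [[Fraction(3, 10), Fraction(3, 10)],
        [Fraction(1, 10), Fraction(2, 10)]]

small = rescale(plan, mu, nu)
row = [sum(r) for r in small]
col = [sum(small[i][j] for i in range(2)) for j in range(2)]
print("rescaled marginals:", row, col)
print("remaining mass:", sum(row))
```

The reason the marginals end up dominated is the elementary bound $f / (1 + |f - 1|) \le 1$ for $f \ge 0$, applied pointwise to the marginal density.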
Proof of Theorem 1.2. By the preceding proposition we have to show that

$$\overline{\Phi}(1, 1) = D,$$

where the dual value $D$ of the optimal transport problem is defined in (4).

By Lemma 2.2 we know that there are sequences⁴ $(\varphi_n, \psi_n)_{n=1}^\infty \in W^* = L^\infty(\mu) \times L^\infty(\nu)$ such that

$$\lim_{n \to \infty} \langle (\varphi_n, \psi_n), (1, 1) \rangle = \lim_{n \to \infty} \left[ \int_X \varphi_n\, d\mu + \int_Y \psi_n\, d\nu \right] = \overline{\Phi}(1, 1) \in [0, \infty],$$

and such that

$$\langle (\varphi_n, \psi_n), (f, g) \rangle = \langle \varphi_n, f \rangle + \langle \psi_n, g \rangle \le \overline{\Phi}(f, g), \quad \text{for all } (f, g) \in V. \tag{16}$$

We shall show that (16) implies that, for each fixed $n \in \mathbb{N}$, there are representants⁵ $(\tilde{\varphi}_n, \tilde{\psi}_n)$ of $(\varphi_n, \psi_n)$ such that

$$\tilde{\varphi}_n(x) + \tilde{\psi}_n(y) \le c(x, y) \tag{17}$$

for all $(x, y) \in X \times Y$. Indeed, choose any $\mathbb{R}$-valued representants $(\hat{\varphi}_n, \hat{\psi}_n)$ of $(\varphi_n, \psi_n)$ and consider the set

$$C = \{(x, y) \in X \times Y : \hat{\varphi}_n(x) + \hat{\psi}_n(y) > c(x, y)\}. \tag{18}$$

Claim: For every $\pi \in \Pi(\mu, \nu)$ we have that $\pi(C) = 0$.

Indeed, fix $\pi \in \Pi(\mu, \nu)$ and denote by $(f, g)$ the density functions of the projections $p_X(\pi|_C)$ and $p_Y(\pi|_C)$. By (16) we have, for $n \ge 1$,

$$\int_{X \times Y} c\, 1_C\, d\pi \ge \overline{\Phi}(f, g) \ge \langle \varphi_n, f \rangle + \langle \psi_n, g \rangle = \int_{X \times Y} (\hat{\varphi}_n(x) + \hat{\psi}_n(y))\, 1_C\, d\pi(x, y).$$

By the definition of $C$ the first term above can only be greater than or equal to the last term if $\pi(C) = 0$, which readily shows the above claim.

Now we are in a position to apply an innocent looking, but deep result due to H. Kellerer [Kel84, Lemma 1.8]⁶: a Borel set $C \subseteq X \times Y$ satisfies $\pi(C) = 0$, for each $\pi \in \Pi(\mu, \nu)$, if and only if there are Borel sets $M \subseteq X$, $N \subseteq Y$ with $\mu(M) = \nu(N) = 0$ such that $C \subseteq (M \times Y) \cup (X \times N)$. Choosing such sets $M$ and $N$ for the set $C$ in (18), define the representants $(\tilde{\varphi}_n, \tilde{\psi}_n)$ by $\tilde{\varphi}_n = \hat{\varphi}_n 1_{X \setminus M} - \infty 1_M$ and $\tilde{\psi}_n = \hat{\psi}_n 1_{Y \setminus N} - \infty 1_N$. We then have $\tilde{\varphi}_n(x) + \tilde{\psi}_n(y) \le c(x, y)$, for every $(x, y) \in X \times Y$. As

$$\lim_{n \to \infty} \left[ \int_X \tilde{\varphi}_n\, d\mu + \int_Y \tilde{\psi}_n\, d\nu \right] = \overline{\Phi}(1, 1) = P^{\mathrm{rel}},$$

the proof of Theorem 1.2 is complete. □

⁴ The dual space $V^*$ of the subspace $V$ of $W = L^1(\mu) \times L^1(\nu)$ equals the quotient of the dual $L^\infty(\mu) \times L^\infty(\nu)$, modulo the annihilator of $V$, i.e. the one dimensional subspace formed by the $(\varphi, \psi) \in L^\infty(\mu) \times L^\infty(\nu)$ of the form $(\varphi, \psi) = (a, -a)$, for $a \in \mathbb{R}$.

⁵ Strictly speaking, $(\varphi_n, \psi_n)$ are elements of $L^\infty(\mu) \times L^\infty(\nu)$, i.e. equivalence classes of functions. The $[-\infty, \infty)$-valued Borel measurable functions $(\tilde{\varphi}_n, \tilde{\psi}_n)$ will be properly chosen representants of these equivalence classes.

⁶ For the convenience of the reader and in order to keep the present paper self-contained, we provide in the appendix (Lemma A.1) a proof of Kellerer's lemma which does not rely on duality arguments.

3. Consequences of the Duality Theorem

Assume first that the Borel measurable cost function $c : X \times Y \to [0, \infty]$ is $\mu \otimes \nu$-almost surely bounded by some constant⁷ $M$. We then may estimate

$$P \le P^\varepsilon + \varepsilon M.$$

Indeed, for $\varepsilon > 0$, every partial transport plan $\pi_\varepsilon$ with marginals $\mu_\varepsilon \le \mu$, $\nu_\varepsilon \le \nu$ and mass $\|\pi_\varepsilon\| = 1 - \varepsilon$ may be completed to a full transport plan $\pi$ by letting, e.g.,

$$\pi = \pi_\varepsilon + \varepsilon^{-1} (\mu - \mu_\varepsilon) \otimes (\nu - \nu_\varepsilon).$$

As $c \le M$ we have $\int c\, d\pi \le \int c\, d\pi_\varepsilon + \varepsilon M$. This yields the following corollary due to H. Kellerer [Kel84, Theorem 2.2].

Corollary 3.1. Let $X, Y$ be Polish spaces equipped with Borel probability measures $\mu, \nu$, and let $c : X \times Y \to [0, \infty]$ be a Borel measurable cost function which is uniformly bounded. Then there is no duality gap, i.e. $P = D$.

⁷ In fact, the same argument works provided that $c(x, y) \le f(x) + g(y)$ for integrable functions $f, g$.
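The completion of a partial plan used in the argument before Corollary 3.1 can be tried out on a finite example. The following Python sketch is illustrative and rests on assumptions not made in the paper: the spaces are finite, measures are lists of exact fractions, and the names `marginals`, `complete`, `total_cost` as well as the sample cost matrix are hypothetical. It adds the product of the two deficit measures, scaled by $1/\varepsilon$, and compares the costs before and after.

```python
from fractions import Fraction


def marginals(plan):
    """Row and column sums of a plan given as a list of rows."""
    rows = [sum(r) for r in plan]
    cols = [sum(r[j] for r in plan) for j in range(len(plan[0]))]
    return rows, cols


def complete(partial, mu, nu):
    """Complete a partial plan of mass 1 - eps to a full plan with marginals mu, nu."""
    mu_eps, nu_eps = marginals(partial)
    eps = 1 - sum(mu_eps)
    if eps == 0:
        return [r[:] for r in partial]   # already a full plan
    return [[partial[i][j] + (mu[i] - mu_eps[i]) * (nu[j] - nu_eps[j]) / eps
             for j in range(len(nu))] for i in range(len(mu))]


def total_cost(c, plan):
    """Integral of the cost matrix c against the plan."""
    return sum(c[i][j] * plan[i][j]
               for i in range(len(plan)) for j in range(len(plan[0])))


third, zero = Fraction(1, 3), Fraction(0)
mu = nu = [third, third, third]
c = [[1, 2, 5], [0, 1, 2], [3, 0, 1]]   # bounded cost, illustration only
M = 5                                    # upper bound of c
partial = [[zero, zero, zero], [third, zero, zero], [zero, third, zero]]

full = complete(partial, mu, nu)
eps = 1 - sum(sum(r) for r in partial)
print("marginals of the completed plan:", marginals(full))
print("cost before and after:", total_cost(c, partial), total_cost(c, full))
```

In this example the added mass sits on the most expensive cell, so the bound $\int c\, d\pi \le \int c\, d\pi_\varepsilon + \varepsilon M$ is attained with equality.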

To establish duality in the setup of a lower semi-continuous cost function $c$, it suffices to note that in this setting also the cost functional $\Phi$ is lower semi-continuous:

Lemma 3.2. [Vil09, Lemma 4.3] Let $c : X \times Y \to [0, \infty]$ be lower semi-continuous and assume that a sequence of measures $\pi_n$ on $X \times Y$ converges to a transport plan $\pi \in \Pi(\mu, \nu)$ weakly, i.e. in the topology induced by the bounded continuous functions on $X \times Y$. Then

$$\int c\, d\pi \le \liminf_{n \to \infty} \int c\, d\pi_n.$$

Corollary 3.3. [Kel84, Theorem 2.6] Let $X, Y$ be Polish spaces equipped with Borel probability measures $\mu, \nu$, and let $c : X \times Y \to [0, \infty]$ be a lower semi-continuous cost function. Then there is no duality gap, i.e. $P = D$.

Proof. It follows from Prokhorov's theorem and Lemma 3.2 that the function $\Phi : V_+ \to [0, \infty]$ is lower semi-continuous with respect to the norm topology of $V$. □

We turn now to the question under which assumptions there is dual attainment. 

Easy examples show that one cannot expect that the dual problem admits integrable maximizers unless the cost function satisfies certain integrability conditions with respect to $\mu$ and $\nu$ [BS09, Examples 4.4, 4.5]. In fact [BS09, Example 4.5] takes place in a very "regular" setting, where $c$ is squared Euclidean distance on $\mathbb{R}$. In this case there exist natural candidates $(\varphi, \psi)$ which, however, fail to be dual maximizers in the usual sense as they are not integrable.

The following solution was proposed in [BS09, Section 1.1]. If $\varphi$ and $\psi$ are integrable functions and $\pi \in \Pi(\mu, \nu)$ then

$$\int_X \varphi\, d\mu + \int_Y \psi\, d\nu = \int_{X \times Y} (\varphi(x) + \psi(y))\, d\pi(x, y). \tag{19}$$

If we drop the integrability condition on $\varphi$ and $\psi$, the left hand side need not make sense. But if we require that $\varphi(x) + \psi(y) \le c(x, y)$ and if $\pi$ is a finite cost transport plan, i.e. $\int_{X \times Y} c\, d\pi < \infty$, then the right hand side of (19) still makes good sense, assuming possibly the value $-\infty$, and we set

$$J_c(\varphi, \psi) = \int_{X \times Y} (\varphi(x) + \psi(y))\, d\pi(x, y).$$

It is not difficult to show (see [BS09, Lemma 1.1]) that this value does not depend on the choice of the finite cost transport plan $\pi$ and satisfies $J_c(\varphi, \psi) \le D$. Under the assumption that there exists some finite transport plan $\pi \in \Pi(\mu, \nu)$ we then say that we have dual attainment in the optimization problem (4) if there exist Borel measurable functions $\hat{\varphi} : X \to [-\infty, \infty)$ and $\hat{\psi} : Y \to [-\infty, \infty)$ verifying $\hat{\varphi}(x) + \hat{\psi}(y) \le c(x, y)$, for $(x, y) \in X \times Y$, such that

$$D = J_c(\hat{\varphi}, \hat{\psi}). \tag{20}$$

We recall a result established in [BS09], generalizing Corollary 3.1. We remark that we do not know how to directly deduce it from Theorem 1.2.

Theorem 3.4. [BS09, Theorems 1 and 2] Let $X, Y$ be Polish spaces equipped with Borel probability measures $\mu, \nu$, and let $c : X \times Y \to [0, \infty]$ be a Borel measurable cost function such that $\mu \otimes \nu(\{(x, y) : c(x, y) = \infty\}) = 0$. Then there is no duality gap, i.e. $P = D$. Moreover there exist Borel measurable functions $\varphi : X \to [-\infty, \infty)$, $\psi : Y \to [-\infty, \infty)$ so that $\varphi(x) + \psi(y) \le c(x, y)$ for all $x \in X$, $y \in Y$ and $J_c(\varphi, \psi) = D$.

Using Theorem 3.4 we now obtain the alternative description of $P^{\mathrm{rel}}$ and the characterization of dual attainment mentioned in the introduction.

Theorem 3.5. Let $X, Y$ be Polish spaces, equipped with Borel probability measures $\mu, \nu$, let $c : X \times Y \to [0, \infty]$ be Borel measurable and assume that there exists a finite transport plan. For every sequence of measurable functions $h_n : X \times Y \to [0, \infty]$, satisfying $h_n \uparrow \infty$ uniformly⁸ and where each $h_n$ is $\mu \otimes \nu$-a.s. finitely valued, we have

$$P_{c \wedge h_n} \uparrow P^{\mathrm{rel}}_c. \tag{21}$$

Moreover, the following are equivalent.

(i) There is dual attainment, i.e. there exist measurable functions $\varphi, \psi$ such that $\varphi(x) + \psi(y) \le c(x, y)$ for $x \in X$, $y \in Y$ and $P^{\mathrm{rel}} = D = J_c(\varphi, \psi)$.

(ii) There exists a $\mu \otimes \nu$-a.s. finite function $h : X \times Y \to [0, \infty]$ such that $P^{\mathrm{rel}} = P_{c \wedge h}$.

Proof. Fix $(h_n)_{n \ge 0}$ as in the statement of the theorem. To prove (21), note that by Theorem 3.4 there exist, for each $n$, measurable functions $\varphi_n : X \to [-\infty, \infty)$, $\psi_n : Y \to [-\infty, \infty)$ satisfying $\varphi_n(x) + \psi_n(y) \le c(x, y)$ for all $x \in X$, $y \in Y$ so that

$$J_{c \wedge h_n}(\varphi_n, \psi_n) = P_{c \wedge h_n}.$$

Thus $P_{c \wedge h_n} \le P^{\mathrm{rel}}$ for each $n$. To see that $\lim_{n \to \infty} P_{c \wedge h_n} \ge P^{\mathrm{rel}}$, fix $\eta > 0$. As $D = P^{\mathrm{rel}}$ there exists $(\varphi, \psi) \in \Psi(\mu, \nu)$ so that $J(\varphi, \psi) > P^{\mathrm{rel}} - \eta$. Note that, for $M > 0$, the pair of functions $(M \wedge (-M \vee \varphi), M \wedge (-M \vee \psi))$ lies in $\Psi(\mu, \nu)$. Hence we may assume without loss of generality that $|\varphi|$ and $|\psi|$ are uniformly bounded by some constant $M$. Pick $n$ so that $h_n(x, y) \ge 2M$ for all $x \in X$, $y \in Y$. It then follows that $(c \wedge h_n)(x, y) \ge \varphi(x) + \psi(y)$ for all $x \in X$, $y \in Y$, hence

$$P_{c \wedge h_n} \ge J(\varphi, \psi) > P^{\mathrm{rel}} - \eta,$$

which shows (21).

To prove that (ii) implies (i), apply Theorem 3.4 to the cost function $c \wedge h$ to obtain functions $\hat{\varphi}$ and $\hat{\psi}$ satisfying $\hat{\varphi}(x) + \hat{\psi}(y) \le (c \wedge h)(x, y)$ and $J_{c \wedge h}(\hat{\varphi}, \hat{\psi}) = P_{c \wedge h}$. Then $J_c(\hat{\varphi}, \hat{\psi}) = P_{c \wedge h} = P^{\mathrm{rel}} = D$, hence $(\hat{\varphi}, \hat{\psi})$ is a pair of dual maximizers.

To see that (i) implies (ii), pick dual maximizers $\hat{\varphi}, \hat{\psi}$ and set $h(x, y) := (\hat{\varphi}(x) + \hat{\psi}(y))_+$. □

⁸ By saying that $h_n$ increases to $\infty$ uniformly, we mean that for every given constant $m \in [0, \infty)$ we have $h_n \ge m$ for $n$ large enough. Indeed it is crucial to insist on this strong type of convergence: one may easily construct examples where $h_n(x, y) \uparrow \infty$ for all $(x, y) \in X \times Y$ while $P_{h_n} = 0$ for every $n \in \mathbb{N}$.

We close this section with a comment concerning a possible relaxed version of the dual problem.

Remark 3.6. Define

$$D^{\mathrm{rel}} := \sup\left\{ \int \varphi\, d\mu + \int \psi\, d\nu : \begin{array}{l} \varphi, \psi \text{ integrable}, \\ \varphi(x) + \psi(y) \le c(x, y)\ \pi\text{-a.e.} \\ \text{for every finite cost } \pi \in \Pi(\mu, \nu) \end{array} \right\},$$

where $\pi \in \Pi(\mu, \nu)$ has finite cost if $\int_{X \times Y} c\, d\pi < \infty$. It is straightforward to verify that we still have $D^{\mathrm{rel}} \le P$. One might conjecture (and the present authors did so for some time) that, similarly to the situation in Theorem 1.2, duality in the form $D^{\mathrm{rel}} = P$ holds without any additional assumption. For instance this is the case in Example 1.1, and combining the methods of [BGMS09] and [BS09] one may prove that $D^{\mathrm{rel}} = P$ provided that the Borel measurable cost function $c : X \times Y \to [0, \infty]$ satisfies that the set $\{c = \infty\}$ is closed in the product topology of $X \times Y$. However a rather complicated example constructed in [BLS09, Section 4] shows that under the assumptions of Theorem 1.2 it may happen that $D^{\mathrm{rel}}$ is strictly smaller than $P$, i.e. that there still is a duality gap.



Appendix A. Appendix

In our proof of Theorem 1.2 we made use of the following innocent looking result due to H. Kellerer:

Lemma A.1. Let $X, Y$ be Polish spaces equipped with Borel probability measures $\mu, \nu$, let $L \subseteq X \times Y$ be a Borel set and assume that $\pi(L) = 0$ for any $\pi \in \Pi(\mu, \nu)$. Then there exist sets $M \subseteq X$, $N \subseteq Y$ such that $\mu(M) = \nu(N) = 0$ and $L \subseteq M \times Y \cup X \times N$.

Lemma A.1 seems quite intuitive and, as we shall presently see, its proof is quite natural provided that the set $L$ is compact. However the general case is delicate and relies on relatively involved results from measure theory. H. Kellerer proceeded as follows. First he established various sophisticated duality results. Lemma A.1 is then a consequence of the fact that there is no duality gap in the case when the Borel measurable cost function $c$ is uniformly bounded (Corollary 3.1). To make the present paper more self-contained, we provide a direct proof of Lemma A.1 which does not rely on duality results. Still, most ideas of the subsequent proof are, at least implicitly, contained in the work of H. Kellerer.

Some steps in the proof of Lemma A.1 are (notationally) simpler in the case when $(X, \mu) = (Y, \nu) = ([0, 1], \lambda)$, therefore we give a short argument which shows that it is legitimate to make this additional assumption.

Indeed it is rather obvious that one may reduce to the case that the measure spaces $X$ and $Y$ are free of atoms. A well known result of measure theory (see for instance [Kec95, Theorem 17.41]) asserts that for any Polish space $Z$ equipped with a continuous Borel probability measure $\sigma$, there exists a measure preserving Borel isomorphism between the spaces $(Z, \sigma)$ and $([0, 1], \lambda)$. Thus there exist bijections $f : X \to [0, 1]$, $g : Y \to [0, 1]$ which are measurable with measurable inverse, such that $f_\# \mu = g_\# \nu = \lambda$. Hence it is sufficient to consider the case $(X, \mu) = (Y, \nu) = ([0, 1], \lambda)$ and we will do so from now on.

For a measurable set $L \subseteq [0, 1]^2$ we define the functional

$$m(L) := \inf\{\lambda(A) + \lambda(B) : L \subseteq A \times Y \cup X \times B\}.$$

Our strategy is to show that under the assumptions of Lemma A.1 we have that $m(L) = 0$. This implies Lemma A.1 since we have the following result.

Lemma A.2. Let $L \subseteq X \times Y$ be a Borel set with $m(L) = 0$. Then there exist sets $M \subseteq X$, $N \subseteq Y$ such that $\mu(M) = \nu(N) = 0$ and $L \subseteq M \times Y \cup X \times N$.

Proof. Fix $\varepsilon > 0$. Since $m(L) = 0$, there exist sets $A_n, B_n$ such that $\mu(A_n) \le 1/n$ and $\nu(B_n) \le \varepsilon 2^{-n}$ and $L \subseteq A_n \times Y \cup X \times B_n$. Set $A := \bigcap_{n \ge 1} A_n$, $B := \bigcup_{n \ge 1} B_n$. Then $\mu(A) = 0$, $\nu(B) \le \varepsilon$ and

$$L \subseteq \bigcap_{n \ge 1} (A_n \times Y \cup X \times B_n) \subseteq A \times Y \cup X \times B.$$

Iterating this argument with the roles of $X$ and $Y$ exchanged we get the desired conclusion. □



The next step proves Lemma A.1 in the case where $L$ is compact.

Lemma A.3. Assume that $K \subseteq [0, 1]^2$ is compact and satisfies $\pi(K) = 0$ for every $\pi \in \Pi(\lambda, \lambda)$. Then $m(K) = 0$.

Proof. Assume that $\alpha := m(K) > 0$. We have to show that there exists a non-trivial measure $\pi$ on $X \times Y$, i.e. $\pi(K) > 0$, such that $\operatorname{supp} \pi \subseteq K$ and the marginals of $\pi$ satisfy $p_X(\pi) \le \mu$, $p_Y(\pi) \le \nu$. We aim to construct increasingly good approximations $\pi_n$ of such a measure.

Fix $n$ large enough and choose $k \ge 1$ such that $\alpha/3 \le k/n \le \alpha/2$. Since $K$ is non-empty, there exist $i_1, j_1 \in \{0, \ldots, n - 1\}$ such that

$$\left( \left( \tfrac{i_1}{n}, \tfrac{j_1}{n} \right) + \left[ 0, \tfrac{1}{n} \right]^2 \right) \cap K \ne \emptyset.$$

After $m < k$ steps, assume that we have already chosen $(i_1, j_1), \ldots, (i_m, j_m)$. Since $2m/n < \alpha$, we have that $K$ is not covered by

$$\left( \bigcup_{l=1}^m \left[ \tfrac{i_l}{n}, \tfrac{i_l + 1}{n} \right] \right) \times Y \ \cup\ X \times \left( \bigcup_{l=1}^m \left[ \tfrac{j_l}{n}, \tfrac{j_l + 1}{n} \right] \right).$$

Thus there exist

$$i_{m+1} \in \{0, \ldots, n - 1\} \setminus \{i_1, \ldots, i_m\}, \quad j_{m+1} \in \{0, \ldots, n - 1\} \setminus \{j_1, \ldots, j_m\}$$

such that $\left( \left( \tfrac{i_{m+1}}{n}, \tfrac{j_{m+1}}{n} \right) + \left[ 0, \tfrac{1}{n} \right]^2 \right) \cap K \ne \emptyset$. After $k$ steps we define the measure $\pi_n$ to be the restriction of $n \cdot \lambda^2$ (i.e. the Lebesgue measure on $[0, 1]^2$ multiplied with the constant $n$) to the set $\bigcup_{l=1}^k \left( \left( \tfrac{i_l}{n}, \tfrac{j_l}{n} \right) + \left[ 0, \tfrac{1}{n} \right]^2 \right)$. Then the total mass of $\pi_n$ is bounded from below by $k/n \ge \alpha/3$ and the marginals of $\pi_n$ satisfy $p_X(\pi_n) \le \mu$, $p_Y(\pi_n) \le \nu$. These properties carry over to every weak-star limit point of the sequence $(\pi_n)$ and each such limit point $\pi$ satisfies $\operatorname{supp} \pi \subseteq K$ since $K$ is closed. □

The next lemma will enable us to reduce the case of a Borel set $L$ to the case of a compact set $L$.

Lemma A.4. Suppose that a Borel set $L \subseteq [0, 1]^2$ satisfies $m(L) > 0$. Then there exists a compact set $K \subseteq L$ such that $m(K) > 0$.

Lemma A.4 will be deduced from Choquet's capacitability Theorem⁹. Before we formulate this result we introduce some notation. Given a compact metric space $Z$, a capacity on $Z$ is a map $\gamma : \mathcal{P}(Z) \to \mathbb{R}_+$ such that:

(1) $A \subseteq B \Rightarrow \gamma(A) \le \gamma(B)$.

(2) $A_1 \subseteq A_2 \subseteq \ldots \Rightarrow \sup_{n \ge 1} \gamma(A_n) = \gamma\left( \bigcup_{n \ge 1} A_n \right)$.

(3) For every sequence $K_1 \supseteq K_2 \supseteq \ldots$ of compact sets we have $\inf_{n \ge 1} \gamma(K_n) = \gamma\left( \bigcap_{n \ge 1} K_n \right)$.

The typical example of a capacity is the outer measure associated to a finite Borel measure.

Theorem A.5 (Choquet capacitability Theorem). See [Cho59] and also [Kec95, Theorem 30.13]. Assume that $\gamma$ is a capacity on a Polish space $Z$. Then

$$\gamma(A) = \sup\{\gamma(K) : K \subseteq A,\ K \text{ compact}\}$$

for every Borel¹⁰ set $A \subseteq Z$.

Proof of Lemma A.4. We cannot apply Theorem A.5 directly to the functional $m$ since $m$ fails to be a capacity, even if it is extended in a proper way to all subsets of $[0, 1]^2$. A clever trick¹¹ is to replace $m$ by the mapping $\gamma : \mathcal{P}([0, 1]^2) \to [0, 2]$, defined by

$$\gamma(L) := \inf\left\{ \int f\, d\lambda : f : [0, 1] \to [0, 1],\ f(x) + f(y) \ge 1_L(x, y) \text{ for } (x, y) \in [0, 1]^2 \right\}.$$

We then have:

a. For any Borel set $L \subseteq [0, 1]^2$ we have $\gamma(L) \le m(L) \le 4\gamma(L)$.

b. $\gamma$ is a capacity.

To see that (a) holds true notice that $f(x) + f(y) \ge 1_L(x, y)$ implies $L \subseteq \{f \ge 1/2\} \times Y \cup X \times \{f \ge 1/2\}$ and that $L \subseteq A \times Y \cup X \times B$ yields $1_{A \cup B}(x) + 1_{A \cup B}(y) \ge 1_L(x, y)$.

To prove (b) it remains to check that $\gamma$ satisfies properties (2) and (3) of the capacity definition. To see continuity from below, consider a sequence of sets $A_1 \subseteq A_2 \subseteq \ldots$ increasing to $A$. Pick a sequence of functions $f_n$ such that $f_n(x) + f_n(y) \ge 1_{A_n}(x, y)$ point-wise and $\int f_n\, d\lambda \le \gamma(A_n) + 1/n$ for each $n \ge 1$. By Komlós' Lemma there exist functions $g_n \in \operatorname{conv}\{f_n, f_{n+1}, \ldots\}$ such that the sequence $(g_n)$ converges $\lambda$-a.s. to a function $g : [0, 1] \to [0, 1]$. After changing $g$ on a $\lambda$-null set if necessary, we have that $g(x) + g(y) \ge 1_A(x, y)$ point-wise, so that $\gamma(A) \le \int g\, d\lambda$. By dominated convergence, $\int g\, d\lambda = \lim_{n \to \infty} \int g_n\, d\lambda \le \lim_{n \to \infty} (\gamma(A_n) + 1/n) \le \gamma(A)$, where the last inequality follows from monotonicity. Thus $\gamma$ satisfies property (2). The proof of (3) follows precisely the same scheme.

An application of Choquet's Theorem A.5 now finishes the proof of Lemma A.4. □

⁹ It seems worth noting that Kellerer also employs the Choquet capacitability Theorem.

¹⁰ In fact, the assertion of the Choquet capacitability Theorem is true for the strictly larger class of analytic sets.

¹¹ We thank Richard Balka and Marton Elekes for showing us this argument (private communication).

We have done all the preparations to prove Lemma A.1 and now summarize the necessary steps.

Proof of Lemma A.1. As discussed above, we may assume w.l.o.g. that $(X, \mu) = (Y, \nu) = ([0, 1], \lambda)$. Suppose that the Borel set $L \subseteq [0, 1]^2$ satisfies $\pi(L) = 0$ for all $\pi \in \Pi(\mu, \nu)$. Striving for a contradiction, we assume that $m(L) > 0$. By Lemma A.4 we find that there exists a compact set $K \subseteq L$ such that $m(K) > 0$. By Lemma A.3 there is a measure $\pi \in \Pi(\mu, \nu)$ such that $\pi(K) > 0$, hence also $\pi(L) > 0$, in contradiction to our assumption. Thus $m(L) = 0$. By Lemma A.2 we may conclude that there exist sets $M \subseteq X$, $N \subseteq Y$ with $\mu(M) = \nu(N) = 0$ such that $L \subseteq M \times Y \cup X \times N$, hence we are done. □